# EFFICIENT DESIGN OF PULSE TRIGGERED FLIP-FLOP USING PASS TRANSISTOR LOGIC

Mr. S.Belwin Joel M.E<sup>1</sup>

<sup>1</sup>Assistant Professor, Dept. of ECE, James College of Engineering and Technology, Nagercoil, Tamilnadu.

#### belwinjoel@gmail.com

**Abstract:** Flip-flops are critical timing elements in digital circuits and have a large impact on circuit speed and power consumption. The performance of the flip-flop is therefore an important factor in the efficiency of the whole synchronous circuit. In an attempt to reduce power consumption in flip-flops, a novel method is presented. An implicit-type pulse-triggered flip-flop is designed using a conditional pulse enhancement scheme. The pulse generation logic uses a two-input AND gate in its discharge path, which reduces the circuit complexity and hence the overall area. Discharge pulses are generated only when needed, which reduces circuit activity, eliminates the associated extra power, and provides a faster discharge operation. The delay inverters, which consume considerable power in stretching the pulse width, are replaced by PMOS transistors. Transistor sizes are also reduced to save area and power. Power consumption is reduced compared with conventional methods.

Keywords: Flip-flop, AND Gate, PMOS transistor, Power save

#### I. INTRODUCTION

Flip-flops are the basic storage elements used to store one bit of data, and they are used extensively in all sequential designs. Pipelining techniques, which are now widely used to speed up operation, rely mainly on flip-flops. The clock circuitry consumes a large share of the power: about 20%-30% of the total system power is consumed by the clock distribution circuit itself. The main reason is that the clock signal makes a transition in every cycle (100% transition activity). The logic circuit, in contrast, consumes comparatively little power because its signals switch far less often than the clock. Conventional master-slave flip-flops use two separate clocks for the master and the slave and hence consume more power. A delay is also introduced between the response to the input and the output, so the expected result is produced but comparatively slowly. Edge-triggered flip-flops likewise consume more power. System performance is a major concern, and the need for high-speed circuits has increased. High speed requires a high clock frequency, which in turn consumes more power.
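The link between transition activity and power can be illustrated with the standard dynamic switching power relation P = alpha x C x VDD^2 x f. The sketch below is only an illustration: the 10 fF node capacitance and the 25% data activity are assumed values, not figures from this work, and the activity factor is normalized so that the clock corresponds to 1.0.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Switching power in watts: P = alpha * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Assumed, illustrative values (not taken from the paper's simulations).
C_NODE = 10e-15   # 10 fF of switched capacitance
VDD = 1.0         # supply voltage in volts
FREQ = 500e6      # clock frequency in hertz

# The clock toggles every cycle, so its normalized activity is 1.0.
clock_power = dynamic_power(1.0, C_NODE, VDD, FREQ)
# A data node that toggles in only a quarter of the cycles (assumed).
data_power = dynamic_power(0.25, C_NODE, VDD, FREQ)

print(f"clock node: {clock_power * 1e6:.2f} uW")
print(f"data node:  {data_power * 1e6:.2f} uW")
```

For equal capacitance, the clock node dissipates power in proportion to its higher activity, which is why reducing clock-related switching is an effective target for power saving.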

Pulse-triggered flip-flops can be used for low-power operation. The term pulse-triggered means that data are entered into the flip-flop on the rising edge of the clock pulse, but the output does not reflect the input state until the falling edge of the clock pulse. Because this kind of flip-flop is sensitive to any change of the input levels while the clock pulse is still HIGH, the inputs must be set up prior to the clock pulse's rising edge and must not be changed before the falling edge. Only one latch is needed, so the circuit complexity is lower. Pulse-triggered flip-flops are less sensitive to clock skew and jitter. They reduce the two stages of the master-slave flip-flop to one stage and are characterized by the soft-edge property. The logic complexity and the number of stages inside these pulse-triggered flip-flops are reduced, leading to small D-to-Q delays. One of the main advantages of pulse-triggered flip-flops is that they allow time borrowing across cycle boundaries as a result of their zero or even negative setup time. Owing to these timing properties, pulse-triggered flip-flops provide higher performance than their master-slave counterparts. They do, however, need pulse generation logic to produce the control pulse.

Depending on the type of pulse generation, pulse-triggered flip-flops can be classified into implicit and explicit types. An explicit-type flip-flop requires additional circuitry for pulse generation, and the pulse trains are generated physically. The pulse generator can be shared by neighboring flip-flops, which improves its energy efficiency. Nevertheless, this logic consumes more power in operating the circuit and is hence not well suited to designs where low power is a priority. The problem becomes more significant when large capacitive loads are present at the output, i.e., when one pulse generator drives a large number of flip-flops. The explicit type also has pulse width control issues when low-power techniques such as conditional capture, conditional precharge, conditional discharge, or conditional data mapping are applied.

Implicit-type flip-flops have the pulse generation logic built in, so no additional circuit is needed and the power consumption is reduced. Control over the discharge path is required. This is a power-efficient design, but the longer discharge paths give it inferior timing characteristics, and applying certain low-power techniques makes the timing characteristics considerably worse. The transistor sizes therefore need to be enlarged to produce pulses wide enough to trigger data capture in the flip-flops.

#### II. IMPLICIT-TYPE P-FF DESIGN WITH PULSE CONTROL SCHEME

## A. ip-DCO (implicit pulsed-Data Close to Output)

Some conventional implicit-type P-FF designs, which are used as the reference designs in later performance comparisons, are first reviewed. A state-of-the-art P-FF design, named ip-DCO, is given in Fig. 1(a). It contains an AND-logic-based pulse generator and a semi-dynamic structured latch design. Inverters I5 and I6 are used to latch data, and inverters I7 and I8 are used to hold the internal node. The pulse generator takes complementary and delay-skewed clock signals to generate a transparent window equal in size to the delay of inverters I1-I3. Two practical problems exist in this design. First, during the rising edge, NMOS transistors N2 and N3 are turned on.



Figure 1(a). ip-DCO
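The way an AND of the clock with its delayed complement produces a transparent window can be sketched with sampled waveforms. This is a purely behavioural illustration: the function names are hypothetical, the three-inverter chain I1-I3 is lumped into a single delay expressed in samples, and no transistor-level effects are modelled.

```python
def delayed(signal, delay):
    """Shift a sampled waveform right by `delay` samples, padding with its first value."""
    if delay == 0:
        return list(signal)
    return [signal[0]] * delay + list(signal[:-delay])


def pulse_window(clk, delay):
    """AND the clock with its delayed complement (an odd inverter chain inverts and delays)."""
    clk_bar_delayed = [1 - v for v in delayed(clk, delay)]
    return [a & b for a, b in zip(clk, clk_bar_delayed)]


# Two clock cycles sampled at an arbitrary time step (assumed waveform).
clk = [0] * 5 + [1] * 10 + [0] * 10 + [1] * 10
pulse = pulse_window(clk, delay=3)

print("clk:  ", "".join(map(str, clk)))
print("pulse:", "".join(map(str, pulse)))
```

In this model the pulse is high only for `delay` samples after each rising clock edge, which mirrors the statement that the window width equals the inverter chain delay.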

## B. MHLLF (Modified Hybrid Latch Flip-flop)

An improved P-FF design, named MHLLF and shown in Fig. 1(b), employs a static latch structure. Node X is no longer precharged periodically by the clock signal. A weak pull-up transistor P1, controlled by the FF output signal Q, maintains node X at high when Q is zero. This design eliminates the unnecessary discharging problem at node X. However, it encounters a longer Data-to-Q (D-to-Q) delay during "0" to "1" transitions because node X is not pre-discharged. Larger transistors N3 and N4 are required to enhance the discharging capability. Another drawback of this design is that node X becomes floating when output Q and input Data both equal "1". Extra DC power is consumed if node X drifts from an intact "1".



Figure 1 (b). MHLLF

## C. SCCER (Single Ended Conditional Capture Energy Recovery)

Fig. 1(c) shows a refined low-power P-FF design, named SCCER, which uses a conditional discharge technique. In this design, the keeper logic (back-to-back inverters I7 and I8 in Fig. 1(a)) is replaced by a weak pull-up transistor P1 in conjunction with an inverter I2 to reduce the load capacitance of the internal node. The discharge path contains NMOS transistors N2 and N1 connected in series. To eliminate superfluous switching at the internal node, an extra NMOS transistor N3 is employed. Since N3 is controlled by Q\_fdbk, no discharge occurs if the input data remains high. The worst-case timing of this design occurs when the input data is "1" and the node is discharged through four transistors in series. This implies wider N1 and N2 transistors and a longer delay from the delay inverter I1 to widen the discharge pulse. The use of a transmission gate reduces the voltage drop across the pass transistor, and hence the power dissipation is reduced. It also doubles the area and interconnects, but the overall size of the circuit is reduced.



Figure 1(c). SCCER

## D. P-FF Design with conditional pulse enhancement scheme

The design, as shown in Fig. 1(d), adopts two measures to overcome the problems associated with existing P-FF designs. The first is to reduce the number of NMOS transistors stacked in the discharging path. The second is to support a mechanism that conditionally enhances the pull-down strength when the input data is "1". As opposed to the transistor stacking design in Fig. 1(a) and (c), transistor N2 is removed from the discharging path. Transistor N2, in conjunction with an additional transistor N3, forms a two-input pass transistor logic (PTL)-based AND gate to control the discharge of transistor N1. Since the two inputs to the AND logic are mostly complementary (except during the transition edges of the clock), the output node Z is kept at zero most of the time. When both input signals equal "0" (during the falling edges of the clock), the temporary floating of node Z is basically harmless. At the rising edges of the clock, both transistors N2 and N3 are turned on and collaborate to pass a weak logic high to node Z, which then turns on transistor N1 for a time span defined by the delay inverter I1. The switching power at node Z can be reduced due to the diminished voltage swing. Unlike the MHLLF design, where the discharge control signal is driven by a single transistor, parallel conduction of two NMOS transistors (N2 and N3) speeds up the operation of pulse generation. With this design measure, the number of stacked transistors along the discharging path is reduced, and the sizes of transistors N1-N5 can be reduced as well.



Figure 1(d). P-FF Design with conditional pulse enhancement scheme
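The four input combinations of the PTL-based AND gate can be tabulated with a small switch-level model of its output node (node Z in the later discussion). The wiring assumed here is inferred from the text and is not a verified netlist: N2 passes the clock when the delayed inverted clock is high, and N3 passes the delayed inverted clock when the clock is high. The threshold value is likewise an assumed, illustrative number.

```python
VDD = 1.0
VTH = 0.3  # assumed NMOS threshold voltage, for illustration only


def ptl_and_node(clk, clk_bar_delayed, previous_v):
    """Voltage on the AND output node for one combination of the two inputs.

    Assumed wiring: N2 (gate = clk_bar_delayed) passes clk,
    N3 (gate = clk) passes clk_bar_delayed.
    """
    n2_on = clk_bar_delayed == 1
    n3_on = clk == 1
    if n2_on and n3_on:
        # Both NMOS devices pass a high, degraded by one threshold drop.
        return VDD - VTH
    if n2_on or n3_on:
        # The single conducting device passes the input that is at logic 0.
        return 0.0
    # Both devices off: the node floats and keeps its previous charge.
    return previous_v


for clk in (0, 1):
    for clk_bar_delayed in (0, 1):
        v = ptl_and_node(clk, clk_bar_delayed, previous_v=0.0)
        print(f"clk={clk} clk_bar_delayed={clk_bar_delayed} -> {v:.2f} V")
```

Under these assumptions, the node is driven to zero whenever the inputs are complementary, rises to a weak high only when both inputs are high, and merely holds its charge when both are low.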

#### III. PROPOSED DESIGN

The proposed design adopts two measures to overcome the problems associated with existing P-FF designs. The first is to reduce the number of NMOS transistors stacked in the discharging path. The second is to support a mechanism that conditionally enhances the pull-down strength when the input data is "1". The upper part of the latch design is similar to the one employed in the SCCER design.

#### Advantages:

- Switching power is reduced by reducing the voltage swing.
- Speed of pulse generation is increased by using parallel NMOS transistors.
- Pull-down strength is increased by conditional pulses, which switch the transistor only when needed.

#### A. CIRCUIT DIAGRAM

Transistor N2 is removed from the discharging path. Transistor N2, in conjunction with an additional transistor N3, forms a two-input pass transistor logic (PTL)-based AND gate to control the discharge of transistor N1. Since the two inputs to the AND logic are mostly complementary (except during the transition edges of the clock), the output node Z is kept at zero most of the time. When both input signals equal "0", the temporary floating of node Z is basically harmless.

At the rising edges of the clock, both transistors N2 and N3 are turned on and collaborate to pass a weak logic high to node Z, which then turns on transistor N1 for a time span defined by the delay inverter I1.



Figure 2. Proposed design

The switching power at node Z can be reduced due to the diminished voltage swing. Unlike the MHLLF design, where the discharge control signal is driven by a single transistor, parallel conduction of two NMOS transistors (N2 and N3) speeds up the operation of pulse generation. With this design measure, the number of stacked transistors along the discharging path is reduced, and the sizes of transistors N1-N5 can be reduced as well.

In this design, the longest discharging path is formed when the input data is "1" while the Q-bar output is "1". To enhance the discharging under this condition, transistor P3 is added. Transistor P3 is normally turned off because node X is pulled high most of the time. It steps in when node X is discharged to |VTP| (the PMOS threshold magnitude) below VDD. This provides an additional boost to node Z (from VDD-VTH to VDD). The generated pulse is taller, which enhances the pull-down strength of transistor N1. After the rising edge of the clock, the delay inverter I1 drives node Z back to zero through transistor N3 to shut down the discharging path. The voltage level of node X rises and eventually turns off transistor P3. With the intervention of P3, the width of the generated discharging pulse is stretched out. This means that a bulky delay inverter design, which accounts for most of the power consumption in the pulse generation logic, is no longer needed to create a pulse of sufficient width for correct data capture. It should be noted that this conditional pulse enhancement technique takes effect only when the FF output is subject to a data change from 0 to 1. Another benefit of this conditional pulse enhancement scheme is the reduction in leakage power due to the shrunken transistors in the critical discharging path and in the delay inverter.
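The effect of boosting node Z from VDD-VTH to VDD on the pull-down strength of N1 can be estimated with the alpha-power-law approximation, in which the saturation current scales as (VGS - VTH)^alpha. This is a rough first-order sketch, not the simulation model used in this work: the threshold voltage is an assumed value, and alpha = 2 (the long-channel square law) is an upper bound, since short-channel devices typically show a smaller exponent.

```python
def relative_drive(vgs, vth, alpha=2.0):
    """Relative saturation current, proportional to (Vgs - Vth)^alpha."""
    overdrive = max(vgs - vth, 0.0)  # the device is off below threshold
    return overdrive ** alpha


VDD = 1.0
VTH = 0.3  # assumed NMOS threshold voltage, for illustration only

weak = relative_drive(VDD - VTH, VTH)   # gate of N1 at the weak high
boosted = relative_drive(VDD, VTH)      # gate of N1 after the P3 boost
gain = boosted / weak

print(f"drive gain from the boost (alpha = 2): {gain:.2f}x")
print(f"drive gain from the boost (alpha = 1.3): "
      f"{relative_drive(VDD, VTH, 1.3) / relative_drive(VDD - VTH, VTH, 1.3):.2f}x")
```

Even with the smaller exponent, the boosted gate voltage gives N1 a noticeably larger drive current, which is consistent with the argument that the discharging transistors can be made smaller.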

To demonstrate the superiority of the proposed design, post-layout simulations on various P-FF designs were conducted to obtain their performance figures. These designs include the three P-FF designs shown in Fig. 1 (ip-DCO, MHLLF, and SCCER), another P-FF design called conditional capture FF (CCFF), and two non-pulse-triggered FF designs, i.e., a sense-amplifier-based FF (SAFF) and a conventional transmission-gate-based FF (TGFF). The target technology is the UMC 90-nm CMOS process. The operating condition used in simulations is 500 MHz/1.0 V. Since pulse width design is crucial to the correctness of data capturing as well as to the power consumption, the pulse generator logic in each design is first sized to function properly across process variation. All designs are further optimized subject to the tradeoff between power and D-to-Q delay, i.e., minimizing the product of the two terms. Fig. 3 shows the simulation setup model. To mimic the signal rise and fall time delays, input signals are generated through buffers.

The output of the FF is loaded with a 20-fF capacitor. An extra capacitance of 3 fF is also placed after the clock buffer. To illustrate the merits of the presented work, Fig. 4 shows the simulation waveforms of the proposed P-FF design against the MHLLF design. In the proposed design, pulses at node Z are generated on every rising edge of the clock. Due to the extra voltage boost from transistor P3, pulses generated to capture input data "1" are significantly enhanced in height and width compared with the pulses generated for capturing data "0" (0.84 V versus 0.65 V in height and 141 ps versus 84 ps in width). In the MHLLF design, there is no such differentiation in pulse generation. In addition, no signal degradation occurs in the internal node X of the proposed design. In contrast, the internal node in the MHLLF design is degraded when Q equals "0" and data equals "1". Node Q thus deviates slightly from an intact value "0" and causes DC power consumption at the output stage. From Fig. 4, the height of its pulses at node Z is around 0.68 V. Furthermore, node Z is floating when the clock equals "0", and its value drifts gradually.

To elaborate the power consumption behavior of these FF designs, five test patterns, each exhibiting a different data switching probability, are applied. They are deterministic patterns with 0% (all-zero or all-one), 25%, 50%, and 100% data transition probabilities, respectively.
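Deterministic patterns with these transition probabilities can be generated by repeating short bit sequences. The sketch below shows one possible construction; the specific sequences, the pattern length, and the definition of transition probability as the fraction of adjacent cycles in which the data bit changes are assumptions made for illustration.

```python
def periodic_pattern(period_bits, length):
    """Repeat a short bit sequence until `length` bits are produced."""
    return [period_bits[i % len(period_bits)] for i in range(length)]


def transition_probability(bits):
    """Fraction of adjacent bit pairs whose values differ."""
    changes = sum(a != b for a, b in zip(bits, bits[1:]))
    return changes / (len(bits) - 1)


LENGTH = 801  # 800 adjacent pairs, so the fractions come out exactly

patterns = {
    "0% (all-zero)": periodic_pattern([0], LENGTH),
    "0% (all-one)": periodic_pattern([1], LENGTH),
    "25%": periodic_pattern([0, 0, 0, 0, 1, 1, 1, 1], LENGTH),
    "50%": periodic_pattern([0, 0, 1, 1], LENGTH),
    "100%": periodic_pattern([0, 1], LENGTH),
}

for name, bits in patterns.items():
    print(f"{name:>14}: measured {transition_probability(bits):.2%}")
```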

Due to a shorter discharging path and the employment of a conditional pulse enhancement scheme, the power consumption of the proposed design is the lowest for all test patterns. Taking the test pattern with 50% data transition probability as an example, the power saving of the proposed design ranges from 38.4% (against the ip-DCO design) to 5.6% (against the TGFF design). The saving is even more pronounced at lower data switching activities, where the power consumption of the pulse generation circuitry dominates. Because of a redundant switching power consumption problem at an internal node, the ip-DCO design has the largest power consumption when the data switching activity is 0% (all 1). Fig. 5 shows the curves of power-delay product (PDP) versus setup time (for 50% data switching activity). The PDP values of the proposed design are the smallest among all designs when the setup times are greater than 60 ps. Its minimum PDP occurs when the setup time is 53.9 ps, and the corresponding D-to-Q delay is 116.9 ps. The CCFF design is ranked second in this evaluation, with an optimal setup time of 67 ps. The setup time of the conventional TGFF design is always positive, and its PDP is smallest when the setup time is 47 ps. In general, the MHLLF design has the worst performance due to the drawback of its latch structure. The best performance of each design under different data switching activities was also compared.

The proposed design takes the lead for all types of data switching activity. The SCCER and CCFF designs almost tie for second place. The performance of these designs was also compared at different process corners under the condition of 50% data switching activity, and the performance edge of the proposed design is maintained there as well. Notably, the MHLLF design has the worst performance, especially at the SS process corner, due to a large D-to-Q delay and the poor driving capability of its pulse generation circuit.

Table I also summarizes some important performance indexes of these P-FF designs. These include transistor count, layout area, setup time, hold time, minimum D-to-Q delay, optimal PDP, and the clock tree design. The MHLLF design exhibits the largest layout area because of an oversized pulse generation circuit. Following established measurement methods, curves of D-to-Q delay versus setup time and D-to-Q delay versus hold time are simulated first. Setup time is defined as the point in the curve where the D-to-Q delay is the minimum. Hold time is measured at the point where the slope of the curve equals 1. The proposed design features the shortest minimum D-to-Q delay. Its hold time is longer than that of the other designs because the transistor (P3) used for pulse enhancement requires a prolonged availability of the data input.

The power drawn from the clock tree is calculated to evaluate the impact of FF loading on the clock jitter. Although the proposed FF design requires the clock signal to be connected to the drain of transistor N2, the drawn current is not significant. Due to the complementary switching behavior of N2 and N3, there exists no signal path from the entry of the clock signal. The clock tree is only liable for charging and discharging node Z. The optimal PDP value of the proposed design is also significantly better than that of the other designs. The simulation results show that the clock tree power of the proposed design is close to those of the two leading designs (MHLLF and CCFF) and outperforms ip-DCO, SCCER, TGFF, and SAFF, where clock signals are connected to the gates of the transistors only.

The setup time is measured as the point where the minimum PDP value occurs. The setup times vary from design to design. Note that although the optimal setup time of the proposed design is 53.9 ps, its PDP value is the lowest among all designs for any setup time greater than 60 ps. The D-to-Q delay and the hold time are calculated subject to the optimal setup time. The D-to-Q delay of the proposed design is second only to the SCCER design and outperforms the conventional TGFF design by a margin of 44.7%. The hold time requirement seems to be slightly larger due to a negative setup time. This number reduces as the setup time moves toward a positive value.
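The selection of the optimal setup time as the point of minimum PDP can be expressed as a short search over a simulated sweep. The sweep values below are hypothetical placeholders chosen only to make the sketch runnable; they are not results from this work, and the function names are likewise illustrative.

```python
def pdp_fj(power_uw, delay_ps):
    """Power-delay product in femtojoules (1 uW x 1 ps = 1e-18 J = 1e-3 fJ)."""
    return power_uw * delay_ps * 1e-3


def optimal_setup(sweep):
    """Return (setup_ps, pdp) for the sweep point with the smallest PDP.

    `sweep` is a list of (setup_ps, power_uw, d_to_q_delay_ps) tuples.
    """
    scored = [(setup, pdp_fj(power, delay)) for setup, power, delay in sweep]
    return min(scored, key=lambda point: point[1])


# Hypothetical sweep data, for illustration only.
hypothetical_sweep = [
    (20, 30.0, 150.0),
    (40, 30.0, 125.0),
    (60, 30.0, 118.0),
    (80, 30.0, 121.0),
    (100, 30.0, 135.0),
]

best_setup, best_pdp = optimal_setup(hypothetical_sweep)
print(f"optimal setup time: {best_setup} ps, PDP: {best_pdp:.2f} fJ")
```

With real data, the same search would be run on the simulated curve of each flip-flop so that all designs are compared at their own optimal operating points.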

#### IV. MICROWIND DESIGN FLOW

MICROWIND supports the entire front-end to back-end design flow. For front-end design, DSCH (the digital schematic editor) provides a built-in pattern-based simulator for digital circuits. Users can also build analog circuits, convert them into SPICE files, and use third-party simulators such as PSpice. DSCH can convert digital circuits into a Verilog file, which can be further synthesized for FPGA/CPLD devices of any vendor. The same Verilog file can be compiled for layout conversion in MICROWIND.

The back-end design of circuits is supported by MICROWIND. Users can design digital circuits and compile them here using a Verilog file. MICROWIND automatically generates an error-free CMOS layout. Although this place-and-route is not highly optimized, complex place-and-route algorithms are not pursued here. Users can also create a CMOS layout of their own by compiling one-line Verilog syntax, or custom-build layouts by manual drawing. The CMOS layouts can be verified using the built-in mixed-signal simulator and analyzed further for DRC, crosstalk, delays, 2D cross section, 3D view, etc.

The MICROWIND software allows the designer to design and simulate an integrated circuit at the physical description level. Microwind3 unifies schematic entry, a pattern-based simulator, SPICE extraction of the schematic, a Verilog extractor, layout compilation, on-layout mixed-signal circuit simulation, a cross-sectional and 3D viewer, netlist extraction, a BSIM4 tutorial on MOS devices, and sign-off correlation. The package contains a library of common logic and analog ICs to view and simulate.

#### V. SIMULATION RESULTS

A simulation window appears showing the inputs and output. The power consumption is shown in the bottom-right portion of the window. If the specifications of the circuit are not met, the transistor sizes are changed, the layout is generated again, and the simulations are rerun until the target delays are achieved. The output observed in the simulation depends on the input sequences assigned at the inputs. To demonstrate the superiority of the proposed design, post-layout simulations on various P-FF designs were conducted to obtain their performance figures.



Figure 3.1(a). ip-DCO MICROWIND circuit diagram

Figure 3.1(b). ip-DCO screenshot

Figure 3.2(a). Proposed MICROWIND circuit diagram

Figure 3.2(b). Proposed screenshot






Figure 3.3(b). SCCER screenshot

Figure 3.4(a). Proposed MICROWIND circuit diagram

Figure 3.4(b). Proposed screenshot

TABLE I. COMPARISON TABLE

| Flip-flop | ip-DCO   | MHLLF    | SCCER    | Proposed |
|-----------|----------|----------|----------|----------|
| Power     | 0.641 mW | 0.549 mW | 0.342 mW | 92.31 µW |
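The relative power reduction implied by Table I can be computed directly from the tabulated values. The short sketch below only performs this arithmetic on the four entries of the table, after converting them to a common unit; the helper name is illustrative.

```python
# Power values from Table I, converted to microwatts.
POWER_UW = {
    "ip-DCO": 641.0,
    "MHLLF": 549.0,
    "SCCER": 342.0,
    "Proposed": 92.31,
}


def saving_percent(reference_uw, proposed_uw):
    """Power saving of the proposed design relative to a reference, in percent."""
    return 100.0 * (reference_uw - proposed_uw) / reference_uw


savings = {
    name: saving_percent(power, POWER_UW["Proposed"])
    for name, power in POWER_UW.items()
    if name != "Proposed"
}

for name, value in savings.items():
    print(f"saving against {name}: {value:.1f}%")
```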

#### VI. CONCLUSION

In this project, we devise a novel low-power pulse-triggered FF design by employing two new design measures. The first successfully reduces the number of transistors stacked along the discharging path by incorporating PTL-based AND logic. The second supports conditional enhancement of the height and width of the discharging pulse so that the size of the transistors in the pulse generation circuit can be kept to a minimum. Simulation results indicate that the proposed design outperforms rival designs in performance indexes such as power, D-to-Q delay, and PDP. Coupled with these design merits is a longer hold-time requirement, which is inherent in pulse-triggered FF designs. As future work, a counter is to be designed with the existing flip-flop designs, the schematics verified using the schematic editor of MICROWIND, and the layout generated. Input stimuli are then to be applied and the appropriate power analysis carried out.

#### REFERENCES

- [1] H. Kawaguchi and T. Sakurai, "A reduced clock-swing flip-flop (RCSFF) for 63% power reduction," *IEEE J. Solid-State Circuits*, vol. 33, no. 5, pp. 807-811, May 1998.
- [2] A. G. M. Strollo, D. De Caro, E. Napoli, and N. Petra, "A novel high speed sense-amplifier-based flip-flop," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 13, no. 11, pp. 1266-1274, Nov. 2005.
- [3] H. Partovi, R. Burd, U. Salim, F. Weber, L. DiGregorio, and D. Draper, "Flow-through latch and edge-triggered flip-flop hybrid elements," in *IEEE Tech. Dig. ISSCC*, 1996, pp. 138-139.
- [4] F. Klass, C. Amir, A. Das, K. Aingaran, C. Truong, R. Wang, A. Mehta, R. Heald, and G. Yee, "A new family of semi-dynamic and dynamic flip-flops with embedded logic for high-performance processors," *IEEE J. Solid-State Circuits*, vol. 34, no. 5, pp. 712-716, May 1999.
- [5] S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J. Sullivan, and T. Grutkowski, "The implementation of the Itanium microprocessor," *IEEE J. Solid-State Circuits*, vol. 37, no. 11, pp. 1448-1460, Nov. 2002.
- [6] J. Tschanz, S. Narendra, Z. Chen, S. Borkar, M. Sachdev, and V. De, "Comparative delay and energy of single edge-triggered and dual edge-triggered pulsed flip-flops for high-performance microprocessors," in *Proc. ISLPED*, 2001, pp. 207-212.
- [7] B. Kong, S. Kim, and Y. Jun, "Conditional-capture flip-flop for statistical power reduction," *IEEE J. Solid-State Circuits*, vol. 36, no. 8, pp. 1263-1271, Aug. 2001.
- [8] N. Nedovic, M. Aleksic, and V. G. Oklobdzija, "Conditional precharge techniques for power-efficient dual-edge clocking," in *Proc. Int. Symp. Low-Power Electron. Design*, Monterey, CA, Aug. 12-14, 2002, pp. 56-59.



### **BIOGRAPHY**

Mr. S. Belwin Joel received his Bachelor of Engineering degree in Electronics and Communication Engineering and his Master of Engineering degree in VLSI Design, and is now working as an Assistant Professor at James College of Engineering and Technology, Nagercoil. He has taught subjects such as VLSI Technology and Linear Integrated Circuits. He has presented many research papers and articles at various conferences and seminars.

